RNA-seq analysis for Jagesh Shah group at Longwood.

Contact Lorena Pantano (lpantano@hsph.harvard.edu) for additional details.

The most recent update of this html document occurred: Thu Feb 2 17:08:53 2017

The sections below provide code to reproduce the included results and plots.

## Warning: package 'knitr' was built under R version 3.3.2

Overview

2017-02-02 17:09:01 INFO::Using gene counts calculated from the Sailfish transcript counts.

Differential expression

remove unwanted variation with ruvseq

Dispersion estimates

Comparison: mice_model_jck_wt


out of 29331 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 29, 0.099%
LFC < 0 (down) : 7, 0.024%
outliers [1] : 0, 0%
low counts [2] : 6750, 23%
(mean count < 1)
[1] see ‘cooksCutoff’ argument of ?results
[2] see ‘independentFiltering’ argument of ?results

NULL

Differential expression file at: mice_model_jck_wt_de.csv

Normalized counts matrix file at: mice_model_jck_wt_log2_counts.csv

MA plot plot

Volcano plot

QC for DE genes p-values/variance

Most significand, FDR< 0.1 and log2FC > 0 : 36

Plots most significand

Plot top 9 genes

Top DE genes

baseMean log2FoldChange lfcSE stat pvalue padj absMaxLog2FC
ENSMUSG00000051397 1242.75431 0.6229989 0.0838691 7.428226 0.00e+00 0.0000000 0.6229989
ENSMUSG00000039043 562.33701 0.2928119 0.0475347 6.159960 0.00e+00 0.0000082 0.2928119
ENSMUSG00000021678 718.54090 0.3667241 0.0602229 6.089448 0.00e+00 0.0000085 0.3667241
ENSMUSG00000036853 903.87767 0.6877411 0.1160272 5.927411 0.00e+00 0.0000174 0.6877411
ENSMUSG00000024479 2146.54521 0.4234369 0.0786794 5.381802 1.00e-07 0.0003337 0.4234369
ENSMUSG00000053977 33.15227 -0.6117497 0.1195484 -5.117172 3.00e-07 0.0011696 0.6117497
ENSMUSG00000069833 4681.72848 0.3295937 0.0727160 4.532616 5.80e-06 0.0188305 0.3295937
ENSMUSG00000024098 3855.34422 0.1982697 0.0445446 4.451041 8.50e-06 0.0241688 0.1982697
ENSMUSG00000026051 154.27817 0.5108982 0.1156679 4.416942 1.00e-05 0.0251669 0.5108982
ENSMUSG00000005131 83.16467 0.4995291 0.1177629 4.241820 2.22e-05 0.0334434 0.4995291
ENSMUSG00000032291 36.05124 -0.4976191 0.1171541 -4.247560 2.16e-05 0.0334434 0.4976191
ENSMUSG00000034456 96.03794 -0.4372152 0.1019052 -4.290412 1.78e-05 0.0334434 0.4372152
ENSMUSG00000042348 1041.36966 0.2703403 0.0633873 4.264900 2.00e-05 0.0334434 0.2703403
ENSMUSG00000045094 59.54043 0.4631928 0.1078332 4.295456 1.74e-05 0.0334434 0.4631928
ENSMUSG00000086807 26.88535 0.5063060 0.1192938 4.244193 2.19e-05 0.0334434 0.5063060
ENSMUSG00000026638 757.34531 0.2013363 0.0483002 4.168434 3.07e-05 0.0433711 0.2013363
ENSMUSG00000063130 82.04451 0.4753051 0.1152858 4.122841 3.74e-05 0.0498076 0.4753051
ENSMUSG00000048834 76.93171 -0.4840475 0.1190624 -4.065494 4.79e-05 0.0602491 0.4840475
ENSMUSG00000098374 386.91352 0.2602137 0.0642747 4.048462 5.16e-05 0.0613942 0.2602137
ENSMUSG00000041782 989.93007 -0.4816925 0.1196945 -4.024348 5.71e-05 0.0643193 0.4816925

Here, I only considered the condition to get the DE genes. These will contain genes where the mean on each condition is different, and will help to detect genes that are always UP or DOWN.

Comparison: mice_model


out of 29376 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 12, 0.041%
LFC < 0 (down) : 21, 0.071%
outliers [1] : 93, 0.32%
low counts [2] : 7276, 25%
(mean count < 1)
[1] see ‘cooksCutoff’ argument of ?results
[2] see ‘independentFiltering’ argument of ?results

NULL

Differential expression file at: mice_model_de.csv

Normalized counts matrix file at: mice_model_log2_counts.csv

MA plot plot

Volcano plot

QC for DE genes p-values/variance

Most significand, FDR< 0.1 and log2FC > 0 : 33

Plots most significand

Plot top 9 genes

Top DE genes

baseMean log2FoldChange lfcSE stat pvalue padj symbol description absMaxLog2FC
ENSMUSG00000038071 471.90441 -1.6987912 0.3718355 42.36484 0.00e+00 0.0000739 Npy6r neuropeptide Y receptor Y6 1.6987912
ENSMUSG00000036853 903.87767 -1.5830983 0.2869988 37.82342 0.00e+00 0.0003157 Mcoln3 mucolipin 3 1.5830983
ENSMUSG00000069833 4681.72848 -0.6672764 0.1298276 37.13760 0.00e+00 0.0003157 Ahnak AHNAK nucleoprotein (desmoyokin) 0.6672764
ENSMUSG00000075588 144.23482 0.7921397 0.2347404 36.35577 1.00e-07 0.0003465 Hoxb2 homeobox B2 0.7921397
ENSMUSG00000031789 35.35275 -3.7971354 0.7057465 35.18578 1.00e-07 0.0004899 Cngb1 cyclic nucleotide gated channel beta 1 3.7971354
ENSMUSG00000036181 619.18393 0.8003920 0.2328268 34.74142 1.00e-07 0.0005067 Hist1h1c histone cluster 1, H1c 0.8003920
ENSMUSG00000015957 199.62821 1.3588321 0.2812957 31.85778 6.00e-07 0.0016989 Wnt11 wingless-type MMTV integration site family, member 11 1.3588321
ENSMUSG00000083670 31.92052 8.0509876 1.5994637 31.65871 6.00e-07 0.0016989 NA NA 8.0509876
ENSMUSG00000029651 53.47416 -2.4367913 0.5825888 30.60947 1.00e-06 0.0025117 Mtus2 microtubule associated tumor suppressor candidate 2 2.4367913
ENSMUSG00000026051 154.27817 -1.7647000 0.3524652 28.65756 2.60e-06 0.0058162 1500015O10Rik RIKEN cDNA 1500015O10 gene 1.7647000
ENSMUSG00000032754 292.25578 -0.2370884 0.1961784 26.82702 6.40e-06 0.0128041 Slc8b1 solute carrier family 8 (sodium/lithium/calcium exchanger), member B1 0.2370884
ENSMUSG00000032487 56.09453 -1.2328051 0.3928699 26.48593 7.50e-06 0.0138369 Ptgs2 prostaglandin-endoperoxide synthase 2 1.2328051
ENSMUSG00000067344 25.85143 4.4037536 1.1508516 26.11994 9.00e-06 0.0152382 NA NA 4.4037536
ENSMUSG00000014813 487.60607 -1.9735020 0.4110123 25.30678 1.33e-05 0.0195423 Stc1 stanniocalcin 1 1.9735020
ENSMUSG00000082361 414.91219 -1.4051898 0.2835380 25.32573 1.32e-05 0.0195423 Btc betacellulin, epidermal growth factor family member 1.4051898
ENSMUSG00000062683 112.20944 10.2865518 2.1356198 25.11978 1.46e-05 0.0200473 Atp5g2 ATP synthase, H+ transporting, mitochondrial F0 complex, subunit C2 (subunit 9) 10.2865518
ENSMUSG00000025155 696.96438 0.2600794 0.1055875 24.35990 2.10e-05 0.0243356 Dus1l dihydrouridine synthase 1-like (S. cerevisiae) 0.2600794
ENSMUSG00000082820 17.79334 -1.4058141 0.8192381 24.44528 2.02e-05 0.0243356 NA NA 1.4058141
ENSMUSG00000096054 122.25463 -1.0371696 0.2376876 24.56297 1.91e-05 0.0243356 Syne1 spectrin repeat containing, nuclear envelope 1 1.0371696
ENSMUSG00000073643 1237.69324 0.8044436 0.4554556 23.14621 3.76e-05 0.0414239 Wdfy1 WD repeat and FYVE domain containing 1 0.8044436

GO ontology of DE genes (log2FC > 0 and FDR < 0.1 ): 33

ID Description GeneRatio BgRatio pvalue p.adjust qvalue
GO:0006814 GO:0006814 sodium ion transport 5/29 198/20996 7.00e-06 0.0038892 0.0030372
GO:0035725 GO:0035725 sodium ion transmembrane transport 4/29 99/20996 1.01e-05 0.0038892 0.0030372

This will detect genes that change differently over time in the two conditions.

Clustering in common patterns

We used diana function inside cluster R package to separate genes using the expression correlation with time. Clusters with more than 3 genes are shown. Significant genes were those with log2FC bigger than 0.1 and FDR < 5%. The file with the information of this analysis is clusters_genes.tsv.

Using expression

Working with 33 genes

Working with 23 genes after filtering

Using fold change with two conditions



Working with  33  genes 



 Working with  20 genes after filtering

Using fold change at each time point



Working with  33  genes 



 Working with  25 genes after filtering

R Session Info

(useful if replicating these results)

R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.12.1 (Sierra)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    methods   stats     graphics  grDevices utils    
[8] datasets  base     

other attached packages:
 [1] RUVSeq_1.6.2               EDASeq_2.6.2              
 [3] ShortRead_1.30.0           GenomicAlignments_1.8.4   
 [5] Rsamtools_1.24.0           Biostrings_2.40.2         
 [7] XVector_0.12.1             BiocParallel_1.6.6        
 [9] vsn_3.40.0                 DEGreport_1.11.2          
[11] quantreg_5.29              SparseM_1.72              
[13] dplyr_0.5.0                cluster_2.0.5             
[15] org.Mm.eg.db_3.3.0         AnnotationDbi_1.34.4      
[17] clusterProfiler_3.0.5      DOSE_2.10.7               
[19] gridExtra_2.2.1            logging_0.7-103           
[21] tximport_1.0.3             DESeq2_1.12.4             
[23] SummarizedExperiment_1.2.3 Biobase_2.32.0            
[25] GenomicRanges_1.24.3       GenomeInfoDb_1.8.7        
[27] IRanges_2.6.1              S4Vectors_0.10.3          
[29] BiocGenerics_0.18.0        pheatmap_1.0.8            
[31] CHBUtils_0.1               edgeR_3.14.0              
[33] limma_3.28.21              gplots_3.0.1              
[35] reshape_0.8.6              ggplot2_2.2.0             
[37] myRfunctions_0.1           knitr_1.15.1              
[39] rmarkdown_1.1             

loaded via a namespace (and not attached):
 [1] colorspace_1.2-7       hwriter_1.3.2          qvalue_2.4.2          
 [4] MatrixModels_0.4-1     topGO_2.24.0           affyio_1.42.0         
 [7] codetools_0.2-15       splines_3.3.1          R.methodsS3_1.7.1     
[10] GOSemSim_1.30.3        DESeq_1.24.0           geneplotter_1.50.0    
[13] Formula_1.2-1          Nozzle.R1_1.1-1        annotate_1.50.1       
[16] GO.db_3.3.0            R.oo_1.20.0            graph_1.50.0          
[19] readr_1.0.0            assertthat_0.1         Matrix_1.2-7.1        
[22] lazyeval_0.2.0         acepack_1.3-3.3        htmltools_0.3.5       
[25] tools_3.3.1            igraph_1.0.1           coda_0.19-1           
[28] gtable_0.2.0           affy_1.50.0            reshape2_1.4.1        
[31] DO.db_2.9              Rcpp_0.12.7            gdata_2.17.0          
[34] preprocessCore_1.34.0  rtracklayer_1.32.2     stringr_1.1.0         
[37] gtools_3.5.0           XML_3.98-1.4           MASS_7.3-45           
[40] zlibbioc_1.18.0        scales_0.4.1           aroma.light_3.2.0     
[43] BiocInstaller_1.22.3   RColorBrewer_1.1-2     yaml_2.1.14           
[46] biomaRt_2.28.0         rpart_4.1-10           latticeExtra_0.6-28   
[49] stringi_1.1.2          RSQLite_1.0.0          highr_0.6             
[52] genefilter_1.54.2      GenomicFeatures_1.24.5 caTools_1.17.1        
[55] chron_2.3-47           matrixStats_0.51.0     bitops_1.0-6          
[58] evaluate_0.10          lattice_0.20-34        labeling_0.3          
[61] GSEABase_1.34.1        plyr_1.8.4             magrittr_1.5          
[64] R6_2.2.0               Hmisc_3.17-4           DBI_0.5-1             
[67] foreign_0.8-67         survival_2.39-5        RCurl_1.95-4.8        
[70] nnet_7.3-12            tibble_1.2             KernSmooth_2.23-15    
[73] locfit_1.5-9.1         grid_3.3.1             data.table_1.9.6      
[76] digest_0.6.10          xtable_1.8-2           tidyr_0.6.0           
[79] R.utils_2.4.0          munsell_0.4.3